Abstract: We study the diffusion behavior of real-time information.
Typically, real-time information is valuable only for a
limited time duration, and hence needs to be delivered before 
its "deadline." Therefore, real-time information is much easier 
to spread among a group of people with frequent interactions 
than between isolated individuals. With this insight, we consider 
a social network which consists of many cliques and information 
can spread quickly within a clique. Furthermore, information can 
also be shared through online social networks, such as Facebook,
Twitter, and YouTube.

We characterize the diffusion of real-time information by 
studying the phase transition behaviors. Capitalizing on the 
theory of inhomogeneous random networks, we show that the 
social network has a critical threshold above which information 
epidemics are very likely to happen. We also theoretically
quantify the fraction of individuals that finally receive the
message. The numerical results indicate that real-time information
propagates much more easily in a social network when
large cliques exist.

I. Introduction 
A. Motivation and Background 

In today's modern society, people are becoming increasingly
tied together over social networks. Thanks to online social
networks, such as Facebook and Twitter, people can share
messages quickly with their friends [1]. Meanwhile, a physical
information network [2], [3], [4] based on traditional
face-to-face interactions still remains an important medium for
message spreading. Very recent work [5] has shown that
different social networks are usually coupled together, and that
this conjoining could greatly facilitate information diffusion. As a
result, today's breaking news or fashion trends are more
likely to exert a pronounced influence over the population
than ever before.

The main thrust of this study is dedicated to understanding
the diffusion behavior of real-time information. Typically,
real-time information is valuable only for a limited time
duration [6], and hence needs to be delivered before its "deadline."
For example, once a limited-time coupon is released from
Groupon or Dealsea.com, people can share this news either by
talking to friends or posting it on Facebook. However, people
would not have much interest in this deal after it is no longer
available.

Clearly, due to the timeliness requirement, the potential
reach of real-time information in a social network
depends on the speed of message propagation. The faster
the message passes from one person to another, the more people
can learn the news before it expires. With this insight, in
order to characterize its diffusion behavior, a key step is to
quantify how fast the message can spread along different social
connections.

In this study, we assume that information can spread
among people through both face-to-face contacts and online
communications. In an online social network, the message can
spread quickly over long distances, and hence it is reasonable
to treat online connections as the same type of links regardless
of real-world distances.

On the other hand, face-to-face communications are largely
constrained by the distance between individuals. Recent works in
[4], [7] have explored the structure of the physical information
network by tracking in-person interactions over the population.
Their findings indicate that such interactions give rise
to a social graph made of a large number of small isolated
cliques. Each clique stands for a group of people who are
close to each other. The message can spread quickly within
a clique via frequent interactions, but takes longer to
spread across cliques separated by longer distances. Clearly,
constrained by its deadline, real-time information is
less likely to propagate across cliques via face-to-face contacts.
Hence, in order to characterize the diffusion behavior
of real-time information, we need to consider the impact of
such clique structure, which has not been studied in previous
works on general information diffusion [5], [8], [9].

B. Summary of Main Contributions 

We explore the diffusion of real-time information in an over- 
laying social-physical network where the information could 
spread amongst people through both face-to-face contacts 
(physical information network) and online communications 
(online social network). Based on empirical observations in 
[4], [7], we assume that the physical information network
consists of many isolated cliques where each clique represents 
a group of people with frequent face-to-face interactions, e.g., 
family in a house or colleagues in an office. Clearly, the face- 
to-face contacts are less likely to happen across cliques. 

We characterize the information diffusion process by studying
the phase transition behaviors (see Section II-C for
details). Specifically, we show that the social network has
a critical threshold above which information epidemics are
very likely to happen, i.e., the information can reach a
nontrivial fraction of individuals. We also quantify the fraction
of individuals that finally receive the message by computing
the size of the giant component (see Section II-C for details). The
numerical results in Section V indicate that real-time information
propagates much more easily in a social network
when large cliques exist. As illustrated in Figure 6, when
the average clique size increases from 1 to 2, the fraction of
individuals that receive the message can grow sharply from
14% to 80%.

Fig. 1. System model H.

Note that our work differs significantly from
previous studies on information propagation. In [8], [9], it is
assumed that the message propagates at the same speed along
different social relationships. Clearly, such an assumption is
inappropriate for the diffusion of real-time information,
which depends on propagation speeds. Very recent work in [5]
considered online connections and face-to-face connections for
general information diffusion, but did not study the impact of
clique structure on information diffusion. To the best of our
knowledge, this paper is the first work on the diffusion of
real-time information that takes into account the clique structure in
social networks. We believe that our work offers initial
steps towards understanding the diffusion behavior of
real-time information.

II. System model 

We consider an overlaying social-physical network H that
consists of a physical information network W and an online
social network F. The nodes in the graph W represent
human beings in the real world. Based on the empirical studies in
[4], [7], we assume that the graph W consists of many isolated
cliques where each clique represents a group of closely located
people, e.g., a family in a house or colleagues in an office.
Meanwhile, each node in W independently participates in the
online social network F with probability $\alpha$, and the nodes in
F stand for their online memberships. Throughout this paper,
we also refer to the nodes in W and F as "individuals" and
"online users," respectively. Furthermore, the links connecting
the nodes in W stand for traditional face-to-face connections,
while the links in F represent online connections.

A. Topology Structure in System Model 

In what follows, we specify the topology structure of the
system model H in Fig. 1.

Cliques in the physical information network. The physical
information network has N nodes and the node set is
denoted by $\mathcal{N} = \{1, 2, \ldots, N\}$. These nodes are gathered into
many cliques with different sizes. We expect the clique size
to follow the distribution $\{\mu_n^w, n = 1, 2, \ldots, D\}$,
where D is the largest possible size. Therefore, an arbitrary
clique contains n nodes with probability $\mu_n^w$. We generate
these cliques as follows: at step t = 1, we randomly choose
n nodes from the collection $\mathcal{N}$ and create a clique with the
selected n nodes, where n is a random number following
the distribution $\{\mu_n^w, n = 1, 2, \ldots, D\}$. We also denote the
collection of the remaining nodes in $\mathcal{N}$ by $\mathcal{N}_1$. At each step
t, we repeat the above procedure to create a new clique from
the collection $\mathcal{N}_{t-1}$, and assume that we can finally generate
$N_c$ cliques in W (see footnote 1). Generally speaking, the existence of
large cliques indicates that many individuals are close to each
other. In other words, the clique size distribution $\{\mu_n^w\}$ offers
an abstract characterization of personal distances in W from
a macroscopic perspective.
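
The clique generation procedure above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `generate_cliques`, the seed, and the example size distribution are assumptions made here and are not taken from the paper. The last clique is truncated when too few nodes remain, as noted in the footnote.

```python
import random

def generate_cliques(num_nodes, size_dist, seed=0):
    """Partition nodes 0..num_nodes-1 into cliques.

    size_dist maps a clique size n to its probability (assumed to sum to 1).
    """
    rng = random.Random(seed)
    sizes = sorted(size_dist)
    weights = [size_dist[n] for n in sizes]
    remaining = list(range(num_nodes))
    rng.shuffle(remaining)  # popping from a shuffled list samples without replacement
    cliques = []
    while remaining:
        n = rng.choices(sizes, weights=weights)[0]  # draw a clique size
        n = min(n, len(remaining))  # the last clique may be smaller than drawn
        cliques.append([remaining.pop() for _ in range(n)])
    return cliques

# Hypothetical example: 12000 nodes, cliques of size 1, 2 or 3.
cliques = generate_cliques(12000, {1: 0.5, 2: 0.3, 3: 0.2})
```

Every node ends up in exactly one clique, so the list of cliques is a partition of the node set.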

Type-0 (intra-clique) links in W. Since the nodes within
the same clique can interact with each other frequently, we
assume these nodes are fully connected by type-0 links.

Type-1 (inter-clique) links in W. We assume that a face-to-face
interaction is still possible between cliques, e.g., a
person may talk to a remote friend by walking across a long
distance. Suppose each node randomly connects to $k^w$
nodes from other cliques through type-1 links, where $k^w$ is
a random variable drawn independently from the distribution
$\{p_k^w, k = 0, 1, \ldots\}$.

Online users and type-2 (online) links. The nodes in
the online network F represent the online users. As in [5],
we assume each online user randomly connects to $k^f$ online
neighbors in F, where $k^f$ is a random variable drawn
independently from the distribution $\{p_k^f, k = 0, 1, \ldots\}$. We
denote such an online connection as a type-2 link. Furthermore, we
draw a virtual type-i link from an online user in F to the actual 
person it corresponds to in the physical information network 
W; this indicates that the two nodes actually correspond to the 
same individual. 

Online users associated with a clique. To avoid confusion,
we say "size-n clique with m online members" when
referring to the case that a clique contains n individuals and
only m of them participate in the online social network F. With
this convention, we can also differentiate among the collection
of size-n cliques according to their affiliated online users.
Specifically, for the collection of size-n cliques with m online
members, $m \le n \le D$, we denote their fraction in the
whole collection of cliques by $\mu_{nm}$. Since each individual
joins F independently with probability $\alpha$, it is easy to see that

$$\mu_{nm} = \mu_n^w \binom{n}{m} \alpha^m (1-\alpha)^{n-m}. \qquad (1)$$

Furthermore, for the collection of cliques with m online users,
their fraction can be given by

$$\mu_m^f = \sum_{n=\max(m,1)}^{D} \mu_{nm}. \qquad (2)$$

Footnote 1: Note that the last generated clique may not follow the expected size
distribution, since there may be too few nodes left to choose from. However,
this impact on the clique size distribution is negligible if the number of
cliques is large enough.

Footnote 2: Throughout this paper, we use "clique in W" and "clique in H"
interchangeably, in the sense that the network W is also a part of the system model H.
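
As an illustration, the fractions of cliques with a given number of online members can be tabulated as follows. The sketch assumes only what the model states, namely that each individual joins F independently with the same probability, so the number of online members of a size-n clique is binomial. The function name `clique_online_fractions` and the example numbers are hypothetical.

```python
from math import comb

def clique_online_fractions(mu_w, alpha):
    """Return (mu_nm, mu_f).

    mu_w  : dict, clique size n -> fraction of cliques of size n
    alpha : probability that an individual is an online user
    mu_nm : dict, (n, m) -> fraction of size-n cliques with m online members
    mu_f  : dict, m -> fraction of cliques with m online members
    """
    mu_nm = {}
    for n, mu_n in mu_w.items():
        for m in range(n + 1):
            # Binomial probability that exactly m of the n members are online.
            mu_nm[(n, m)] = mu_n * comb(n, m) * alpha**m * (1 - alpha) ** (n - m)
    mu_f = {}
    for (n, m), frac in mu_nm.items():
        mu_f[m] = mu_f.get(m, 0.0) + frac  # marginalize over the clique size
    return mu_nm, mu_f

# Hypothetical example distribution with alpha = 0.1.
mu_nm, mu_f = clique_online_fractions({1: 0.5, 2: 0.3, 3: 0.2}, alpha=0.1)
```

Both returned tables sum to one, which gives a quick sanity check on any input distribution.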
B. Information Transmissibility

The message can propagate at different speeds along different
types of social connections in H. Due to the timeliness
requirement, real-time information passes more easily over a
link that offers fast propagation. Therefore, we assign
each link a transmissibility as in [5], [9], i.e., the
probability that the message can successfully pass through it.

Motivated by practical scenarios, we set the transmissibility along
type-0 links as $T_c = 1$ since the message spreads quickly within
a clique. We also define the transmissibilities along type-1 and
type-2 links as $T_w$ and $T_f$, respectively. Throughout this paper,
we say a link is occupied if the message can successfully pass
through that link. Hence, in H each type-1 link is occupied
independently with probability $T_w$, whereas each type-2 link
is occupied independently with probability $T_f$.

C. Information Cascade 

We give a brief description of the information diffusion 
process in the following. Suppose that the message starts to 
spread from an arbitrary node i in a clique of W. Then, the 
other nodes in this clique will quickly receive that message 
through type-0 links. The message can also propagate to other 
cliques through occupied type-1 and type-2 links. This in turn 
may trigger further message propagation and may eventually 
lead to an information epidemic; i.e., a non-zero fraction of 
individuals may receive the information in the limit $N \to \infty$.

Clearly, an arbitrary individual can spread the information 
to nodes that are reachable from itself via the occupied edges 
of H. Hence, the size of an information outbreak (i.e., the 
number of individuals that are informed) is closely related to 
the size of the connected components of H, which contains 
only the occupied type-1 and type-2 links lIS), JO), lH of 
H. Thus, the information diffusion process considered here 
is equivalent to a heterogeneous bond-percolation process 
over the network H; the corresponding bond percolation is 
heterogeneous since the occupation probabilities are different 
for type-1 and type-2 links. In this paper, we will exploit this 
relation and find the condition and the size of information 
epidemics by studying the phase transition properties of H. A 
key observation is that the system H exhibits a phase transition 
behavior at a critical threshold. Specifically, a giant connected
component $G_H$ that covers a nontrivial fraction of nodes in
H is likely to appear above the critical threshold, meaning
that information epidemics are possible. Below that critical
threshold, all components are small, indicating that the fraction
of influenced individuals tends to zero in the large system size
limit.

Fig. 2. Equivalent graph E. Nodes {a,b,c,d} in this graph correspond to
the cliques {a,b,c,d} of the system H in Fig. 1. We assign type-1 and type-2
links in E according to the same type of links connecting cliques in Fig. 1.

It is easy to see that the influenced individuals and cliques
correspond to the nodes and cliques in W that are contained
inside $G_H$. Hence, we introduce two parameters to evaluate
the scale of information diffusion:

• $S_c$: The fractional size of influenced cliques in W.
Namely, $S_c$ corresponds to the ratio of the number of
cliques in $G_H$ to the total number of cliques in W.

• $S_n$: The fractional size of influenced individuals in W.
Namely, $S_n$ corresponds to the ratio of the number of
nodes that belong to the cliques in $G_H$ to the total number
of nodes in W.

With this insight, we can explore the information diffusion
process by characterizing the phase transition behavior of the
giant component $G_H$.
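
To make the bond-percolation view concrete, the following sketch estimates the two fractional sizes on a small finite instance. It works at the clique level (type-0 links have transmissibility one, so each clique is treated as a unit), occupies each inter-clique link independently, and uses the largest connected component as a finite-size stand-in for the giant component. The function names, the link-list format and the toy instance are assumptions made here for illustration.

```python
import random

def find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def epidemic_size(clique_sizes, links, t_w, t_f, seed=0):
    """Return (s_c, s_n) for one percolation realization.

    clique_sizes : list, number of individuals in each clique
    links        : list of (clique_a, clique_b, link_type), link_type in {1, 2}
    t_w, t_f     : occupation probabilities of type-1 and type-2 links
    """
    rng = random.Random(seed)
    parent = list(range(len(clique_sizes)))
    for a, b, link_type in links:
        prob = t_w if link_type == 1 else t_f
        if rng.random() < prob:  # the link is occupied
            root_a, root_b = find(parent, a), find(parent, b)
            if root_a != root_b:
                parent[root_a] = root_b
    members = {}
    for c in range(len(clique_sizes)):
        members.setdefault(find(parent, c), []).append(c)
    largest = max(members.values(), key=len)  # proxy for the giant component
    s_c = len(largest) / len(clique_sizes)
    s_n = sum(clique_sizes[c] for c in largest) / sum(clique_sizes)
    return s_c, s_n

# Toy instance: four cliques chained by type-1, type-2 and type-1 links.
sizes = [1, 2, 3, 2]
links = [(0, 1, 1), (1, 2, 2), (2, 3, 1)]
full = epidemic_size(sizes, links, t_w=1.0, t_f=1.0)
online_only = epidemic_size(sizes, links, t_w=0.0, t_f=1.0)
```

In the toy instance, occupying every link connects all four cliques, whereas keeping only the online link leaves the two middle cliques as the largest component.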

III. Equivalent graph: a clique level approach 

In this study, we are particularly interested in the following 
two questions: 

• What is the critical threshold of the system H? In other
words, under what condition does the information reach a
nontrivial fraction of the network rather than dying out
quickly?

• What is the expected size of an information epidemic? In
other words, what fraction of nodes and cliques does
the information reach? Or, equivalently, what are the sizes
$S_c$ and $S_n$?

These two questions can be answered by quantifying the phase
transition behaviors of the random graph H. Due to the clique
structure in our system model, the techniques employed in
existing works [5], [8], [9] cannot be directly applied here.
To tackle this challenge, we develop an equivalent random
graph E that exhibits the same phase transition behavior as the
original model H. Then, we characterize the phase transition
behaviors in the graph E by capitalizing on recent results
on inhomogeneous random graphs [10], [11].

We first construct an equivalent graph E based on the
topology structure of H. Since the nodes within the same
clique can immediately share the message, we treat each clique,
including its affiliated online users, as a single virtual node in E.
Furthermore, we assign type-1 and type-2 links between two
virtual nodes according to the original connections in H. For
concreteness, we depict in Fig. 2 the equivalent graph
that corresponds to the original model in Fig. 1. It
is easy to see that the (type-1 and type-2) link degree of a
virtual node equals the total number of (type-1 and type-2)
links that are incident on the nodes within the corresponding
clique. The equivalent graph E is expected to exhibit the same
phase transition behavior as the original model H since both
graphs have a similar connectivity structure. In particular, the
fractional size of the giant component $G_E$ in the equivalent
graph E (the ratio of the number of nodes in $G_E$ to the number
of nodes in E) is equal to the aforementioned quantity $S_c$.
Thus, with a slight abuse of notation, we use $S_c$ to denote the
fractional size of $G_E$.

The degree of an arbitrary node in E can be represented
by a two-dimensional vector $d = [d_w \; d_f]$, where $d_w$ and $d_f$
correspond to the numbers of type-1 and type-2 links incident
on that node, respectively. For a node in E that corresponds
to a size-n clique in W, we use $K_n^w$ to denote its type-1
link degree, where $K_n^w$ is a random variable following the
distribution $\{P^w_{nk}, k = 0, 1, 2, \ldots\}$. Similarly, for a node in
E that corresponds to a clique with m online users, we use
$K_m^f$ to denote its type-2 link degree, where $K_m^f$ follows the
distribution $\{P^f_{mk}, k = 0, 1, 2, \ldots\}$. It is clear that an
arbitrary node in E has link degree [i j] with probability

$$p(i,j) = \sum_{n=1}^{D}\sum_{m=0}^{n} \mu_{nm} P^w_{ni} P^f_{mj}. \qquad (3)$$

Let $E[d_w]$ and $E[d_f]$ be the mean numbers of type-1 and type-2
links for a node in E, i.e.,
$E[d_w] = \sum_{i=0}^{\infty}\sum_{j=0}^{\infty} p(i,j)\, i$
and $E[d_f] = \sum_{i=0}^{\infty}\sum_{j=0}^{\infty} p(i,j)\, j$. We also define
$E[d_w d_f] = \sum_{i=0}^{\infty}\sum_{j=0}^{\infty} p(i,j)\, ij$.
Furthermore, let $E[(d_w)^2]$ and $E[(d_f)^2]$
denote the second moments of the number of type-1 and
type-2 links for a node in E, respectively; i.e.,
$E[(d_w)^2] = \sum_{i=0}^{\infty}\sum_{j=0}^{\infty} p(i,j)\, i^2$ and
$E[(d_f)^2] = \sum_{i=0}^{\infty}\sum_{j=0}^{\infty} p(i,j)\, j^2$.
IV. Analytical solutions 

In this section, we analyze the information diffusion process by
characterizing the phase transition behaviors in the equivalent
random graph E. We present our analytical results in the
following two steps. We first quantify the conditions for the
emergence of a giant component as well as the fractional sizes
$S_c$ and $S_n$ for the special case $T_w = 1$ and $T_f = 1$. We
next show that these results can be easily extended to a more
general case with $0 < T_w < 1$ and $0 < T_f < 1$.

A. Special Case: $T_w = T_f = 1$

We characterize the phase transition behavior of the giant
component in E by capitalizing on the theory of inhomogeneous
random graphs [10], [11], [12]. Specifically, we define
$a_{11} = E[(d_w)^2]/E[d_w] - 1$, $a_{12} = E[d_w d_f]/E[d_w]$,
$a_{21} = E[d_w d_f]/E[d_f]$ and $a_{22} = E[(d_f)^2]/E[d_f] - 1$.
Along the same line as in [5], [10], [12], we have the following result.

Lemma 4.1: Let

$$\sigma = \frac{1}{2}\left(a_{11} + a_{22} + \sqrt{(a_{11} - a_{22})^2 + 4 a_{12} a_{21}}\right). \qquad (4)$$

If $\sigma > 1$, then with high probability (whp) there exists a giant
component in E, i.e., a nontrivial fraction of nodes in E are
connected; otherwise, a giant component does not exist in E
whp.

The proof of Lemma 4.1 is relegated to Appendix A. As we
discussed in Section II-C, the existence of a giant component
in E indicates that the information can reach a nontrivial
fraction of cliques in H rather than dying out quickly.
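
The threshold condition of Lemma 4.1 is straightforward to evaluate once the joint degree distribution of the virtual nodes is known. The sketch below computes the five moments and the resulting spectral radius from a dictionary of probabilities. The function name `critical_sigma` and the toy distribution are hypothetical; the distribution is assumed to have nonzero mean type-1 and type-2 degrees.

```python
from math import sqrt

def critical_sigma(p):
    """Spectral radius sigma of Lemma 4.1.

    p maps (i, j) to the probability that a virtual node has
    i type-1 links and j type-2 links.
    """
    e_w = sum(i * q for (i, j), q in p.items())       # E[d_w]
    e_f = sum(j * q for (i, j), q in p.items())       # E[d_f]
    e_wf = sum(i * j * q for (i, j), q in p.items())  # E[d_w d_f]
    e_w2 = sum(i * i * q for (i, j), q in p.items())  # E[(d_w)^2]
    e_f2 = sum(j * j * q for (i, j), q in p.items())  # E[(d_f)^2]
    a11 = e_w2 / e_w - 1
    a12 = e_wf / e_w
    a21 = e_wf / e_f
    a22 = e_f2 / e_f - 1
    return 0.5 * (a11 + a22 + sqrt((a11 - a22) ** 2 + 4 * a12 * a21))

# Toy distribution: half the nodes have degree [1 1], half have degree [2 2].
sigma = critical_sigma({(1, 1): 0.5, (2, 2): 0.5})
has_giant_component = sigma > 1
```

For the toy distribution the moments are $E[d_w] = E[d_f] = 1.5$ and $E[(d_w)^2] = E[(d_f)^2] = E[d_w d_f] = 2.5$, which gives $\sigma = 7/3 > 1$.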

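Next, let $h_1$ and $h_2$ in (0, 1] be given by the smallest solution
to the following recursive equations:

$$h_1 = \frac{1}{E[d_w]} \sum_{n=1}^{D}\sum_{m=0}^{n} \mu_{nm}\, E[K_n^w h_1^{K_n^w - 1}]\, E[h_2^{K_m^f}], \qquad (5)$$

$$h_2 = \frac{1}{E[d_f]} \sum_{n=1}^{D}\sum_{m=0}^{n} \mu_{nm}\, E[h_1^{K_n^w}]\, E[K_m^f h_2^{K_m^f - 1}]. \qquad (6)$$

We have the following results on the size and probability
of an information epidemic.

Lemma 4.2: The fractional size of the giant component in
E (equivalently, the fractional size of influenced cliques in W)
is given by

$$S_c = \sum_{n=1}^{D}\sum_{m=0}^{n} \mu_{nm}\left(1 - E[h_1^{K_n^w}]\,E[h_2^{K_m^f}]\right). \qquad (7)$$

The fractional size of influenced nodes in W is given by

$$S_n = \frac{1}{C}\sum_{n=1}^{D}\sum_{m=0}^{n} n\,\mu_{nm}\left(1 - E[h_1^{K_n^w}]\,E[h_2^{K_m^f}]\right), \qquad (8)$$

with the normalization term $C = \sum_{n=1}^{D} n \mu_n^w$.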

The proof of Lemma 4.2 is relegated to Appendix A. For
any given set of parameters, Lemma 4.2 reveals the fraction
of individuals and cliques that are likely to receive a
message that is started from an arbitrary individual. Namely,
a message started from an arbitrary individual gives rise
to an information epidemic with probability $S_n$ (attributed to
the possibility that the arbitrary node belongs to the giant
component $G_H$), and reaches a fraction $S_n$ of nodes in the
network. Similar conclusions can be drawn in terms of $S_c$ for
the fraction of cliques that receive the information.
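
A numerical solution of the recursive equations can be sketched by fixed-point iteration. The sketch below is an assumption-laden illustration rather than the authors' procedure: it works directly with the joint degree distribution $p(i,j)$ of the virtual nodes and assumes the degree-level form used in Appendix A, i.e., $h_1 = \frac{1}{E[d_w]}\sum p(i,j)\, i\, h_1^{i-1} h_2^{j}$, $h_2 = \frac{1}{E[d_f]}\sum p(i,j)\, j\, h_1^{i} h_2^{j-1}$ and $S_c = \sum p(i,j)(1 - h_1^i h_2^j)$. Starting the iteration from zero is a choice made here so that the iterates increase towards the smallest fixed point. The function name and the toy distribution are hypothetical.

```python
def solve_giant_component(p, tol=1e-14, max_iter=10000):
    """Return (h1, h2, s_c) for a joint degree distribution p[(i, j)]."""
    e_w = sum(i * q for (i, j), q in p.items())
    e_f = sum(j * q for (i, j), q in p.items())
    h1 = h2 = 0.0  # start below every fixed point
    for _ in range(max_iter):
        # Terms with i == 0 (or j == 0) contribute nothing and are skipped
        # to avoid evaluating 0.0 ** -1.
        new_h1 = sum(q * i * h1 ** (i - 1) * h2 ** j
                     for (i, j), q in p.items() if i > 0) / e_w
        new_h2 = sum(q * j * h1 ** i * h2 ** (j - 1)
                     for (i, j), q in p.items() if j > 0) / e_f
        converged = abs(new_h1 - h1) + abs(new_h2 - h2) < tol
        h1, h2 = new_h1, new_h2
        if converged:
            break
    s_c = sum(q * (1 - h1 ** i * h2 ** j) for (i, j), q in p.items())
    return h1, h2, s_c

# Toy distribution chosen so that the fixed point is known in closed form.
p = {(1, 0): 0.25, (0, 1): 0.25, (2, 2): 0.5}
h1, h2, s_c = solve_giant_component(p)
```

For this symmetric toy distribution both unknowns satisfy $h = (0.25 + h^3)/1.25$, whose smallest root is $h = (\sqrt{2} - 1)/2$, so the output can be checked by hand.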

Note that the condition (4) in Lemma 4.1 depends on the
first and second moments of $d_w$ and $d_f$, which boil down to
linear combinations of the first and second moments of $k^w$ and $k^f$
in the following manner:

$$E[d_w] = \sum_{n=1}^{D} \mu_n^w\, n\, E[k^w], \qquad E[d_f] = \sum_{m=1}^{D} \mu_m^f\, m\, E[k^f], \qquad (9)$$

$$E[d_w d_f] = \sum_{n=1}^{D}\sum_{m=1}^{n} \mu_{nm}\, n m\, E[k^w] E[k^f], \qquad (10)$$

$$E[(d_w)^2] = \sum_{n=1}^{D} \mu_n^w \left( n E[(k^w)^2] + (n^2 - n)(E[k^w])^2 \right), \qquad (11)$$

$$E[(d_f)^2] = \sum_{m=1}^{D} \mu_m^f \left( m E[(k^f)^2] + (m^2 - m)(E[k^f])^2 \right). \qquad (12)$$

According to Lemma 4.2, $S_c$ and $S_n$ are determined by
$E[h_1^{K_n^w}]$, $E[K_n^w h_1^{K_n^w - 1}]$, $E[h_2^{K_m^f}]$ and
$E[K_m^f h_2^{K_m^f - 1}]$ in (5)-(8),
i.e., the integrals with respect to the distributions of $K_n^w$ and
$K_m^f$ for different n and m. The calculations can be simplified
by utilizing the following transformations:
With the help of (13)-(15), we only need to calculate the
integrals with respect to the distributions of $k^w$ and $k^f$. In
this way, we can find $h_1$ and $h_2$ by numerically solving
the recursive equations (5)-(6) and compute $S_c$ and $S_n$ from
(7)-(8), respectively. The detailed derivations of (9)-(15) are
omitted (see details in Appendix B).

B. General Case: $0 < T_w < 1$ and $0 < T_f < 1$

We next generalize Lemma 4.1 and Lemma 4.2 to the
case $0 < T_w < 1$ and $0 < T_f < 1$. To this end, we
retain only the occupied links in the equivalent graph E by
deleting each type-1 and type-2 edge with probability $1 - T_w$
and $1 - T_f$, respectively. Let $\tilde{k}^w$ and $\tilde{k}^f$ be the occupied
link degrees (instead of $k^w$ and $k^f$) with the distributions
$\{\tilde{p}_k^w, k = 0, 1, \ldots\}$ and $\{\tilde{p}_k^f, k = 0, 1, \ldots\}$. According to [9],
the generating functions corresponding to $\tilde{k}^w$ and $\tilde{k}^f$ can be
given by

$$\tilde{g}_w(x) = g_w(1 + T_w(x - 1)), \qquad \tilde{g}_f(x) = g_f(1 + T_f(x - 1)), \qquad (16)$$

where $g_w$ and $g_f$ denote the generating functions of $k^w$ and $k^f$.

From (9)-(15), we observe that the critical threshold and
the giant component size are determined by the distributions
of $k^w$ and $k^f$. Therefore, Lemma 4.1 and Lemma 4.2 still hold
if we replace the terms associated with $k^w$ and $k^f$ in (9)-(15)
by those associated with $\tilde{k}^w$ and $\tilde{k}^f$, respectively. To this end,
by using the generating functions (16), we find

$$E[\tilde{k}^w] = \tilde{g}_w'(1) = T_w E[k^w], \qquad E[(\tilde{k}^w)^2] = \tilde{g}_w''(1) + \tilde{g}_w'(1) = T_w^2\left(E[(k^w)^2] - E[k^w]\right) + T_w E[k^w].$$

In the same manner, we can compute $E[\tilde{k}^f]$ and $E[(\tilde{k}^f)^2]$. The
critical threshold (in the general case) can now be computed
by replacing $E[k^w]$, $E[k^f]$, $E[(k^w)^2]$, $E[(k^f)^2]$ with $E[\tilde{k}^w]$,
$E[\tilde{k}^f]$, $E[(\tilde{k}^w)^2]$, $E[(\tilde{k}^f)^2]$, respectively, in (9)-(12).

In order to compute the giant component size, we only need
to replace the corresponding terms in (13)-(15) with $E[h_1^{\tilde{k}^w}]$,
$E[h_2^{\tilde{k}^f}]$, $E[\tilde{k}^w h_1^{\tilde{k}^w - 1}]$ and $E[\tilde{k}^f h_2^{\tilde{k}^f - 1}]$. By using (16), we have

$$E[h_1^{\tilde{k}^w}] = \tilde{g}_w(h_1) = E[(1 + T_w(h_1 - 1))^{k^w}],$$

$$E[\tilde{k}^w h_1^{\tilde{k}^w - 1}] = \tilde{g}_w'(h_1) = T_w E[k^w (1 + T_w(h_1 - 1))^{k^w - 1}].$$

Similar relations can be obtained for $E[h_2^{\tilde{k}^f}]$ and $E[\tilde{k}^f h_2^{\tilde{k}^f - 1}]$.
The size of the giant component (in the general case) can now
be computed by substituting the updated (13)-(15) into (5)-(8).


TABLE I
The clique size distribution in four scenarios
[Table columns: scenario | size-1 | size-2 | size-3 | average clique size]

V. Numerical results and simulations

In this section, we numerically study the diffusion of
real-time information by utilizing the analytical results derived in
Section IV. In particular, we are interested in how the clique
structure impacts the scale of an information epidemic. For
concreteness, we compare four system scenarios,
each with a different clique size distribution as listed in
Table I.

For the sake of fair comparison, the total number of nodes
in W is fixed at 12000 in each scenario. From Table I, we
can see that the average clique size increases from scenario
1 to scenario 4, indicating that individuals are getting closer
to each other. We assume that the type-1 link degree of each
node in W follows a Poisson distribution, i.e.,
$p_k^w = \frac{\lambda^k}{k!} e^{-\lambda}$, $k = 0, 1, 2, \ldots$,
where $\lambda$ is the average type-1 link degree.
Meanwhile, the type-2 link degree of each online user in F
follows a power-law distribution with exponential cutoff, i.e.,
$p_0^f = 0$ and

$$p_k^f = \frac{1}{C}\, k^{-\gamma} e^{-k/\Gamma}, \quad k = 1, 2, \ldots, \qquad (17)$$

with the normalization factor $C = \sum_{k=1}^{\infty} k^{-\gamma} e^{-k/\Gamma}$.
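
Both degree distributions are easy to tabulate numerically. The sketch below is illustrative only: it truncates each distribution at a finite maximum degree `k_max` (an assumption made here so that the sums are finite) and normalizes the power-law weights over the truncated support. The function names and parameter values are hypothetical choices, with the exponent and cutoff matching the values quoted in the figure captions.

```python
from math import exp, factorial

def poisson_pmf(lam, k_max):
    """Poisson probabilities for k = 0..k_max (tail beyond k_max is dropped)."""
    return [lam ** k * exp(-lam) / factorial(k) for k in range(k_max + 1)]

def power_law_cutoff_pmf(gamma, cutoff, k_max):
    """Power law with exponential cutoff on k = 1..k_max; p_0 = 0."""
    weights = [0.0] + [k ** (-gamma) * exp(-k / cutoff)
                       for k in range(1, k_max + 1)]
    norm = sum(weights)  # truncated version of the normalization factor
    return [w / norm for w in weights]

p_w = poisson_pmf(1.5, k_max=60)
p_f = power_law_cutoff_pmf(gamma=3, cutoff=10, k_max=1000)
mean_w = sum(k * q for k, q in enumerate(p_w))
mean_f = sum(k * q for k, q in enumerate(p_f))
```

The resulting lists can be fed into the moment formulas of Section IV to obtain the first and second moments of the two link degrees.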
Fig. 3. The minimum $T_w$ required for the existence of a giant component
in E versus $T_f$ in four scenarios. We let $\lambda = 1.5$, $\alpha = 0.1$, $\gamma = 3$ and
$\Gamma = 10$. Each curve corresponds to the boundary of the phase transition in
one scenario. A giant component is very likely to emerge above the boundary.

We first compare these scenarios in terms of the required
conditions for the existence of a giant component; in other
words, in terms of the minimum conditions for an information
epidemic to take place. We let $\lambda = 1.5$, $\alpha = 0.1$, $\gamma = 3$ and
$\Gamma = 10$. By computing the system's critical threshold, we
depict in Figure 3 the minimum value of $T_w$ required to have
a giant component in E versus $T_f$. Each scenario corresponds
to a curve in the figure that stands for the boundary of a
phase transition; above the boundary a giant component is
very likely to emerge. By the definition of transmissibility, a
lower required $T_w$ indicates that the system is more likely to
give rise to a giant component (and thus, to an information
epidemic). From scenario 1 to scenario 4, this figure clearly
shows that larger clique sizes lead to smaller values of the
minimum $T_w$ required for an information epidemic, meaning
that information epidemics are more likely to take place for
larger clique sizes. The analytical results of Figure 3 are also
verified by simulations. For a fixed $T_f$, the probability
$P_{\mathrm{inf}}$ of the existence of a giant component is expected to increase
sharply as $T_w$ approaches the corresponding minimum
required value in Figure 3. Indeed, we observe in Figure 4 that
when $T_f = 0.4$, for each scenario, such a sharp transition occurs
at $T_w$ close to the corresponding minimum required value
obtained in Figure 3. This sharp increase of $P_{\mathrm{inf}}$ corresponds
to the phase transition, i.e., the giant component exists
with high probability above the critical threshold. Therefore,
the minimum required $T_w$ values obtained via simulations are
in good agreement with our analysis.

Fig. 4. The empirical probability $P_{\mathrm{inf}}$ for the existence of a giant component.
The simulation results are obtained with N = 12000 by averaging
200 experiments. We let $T_f = 0.4$ and other parameters follow the same setup
as in Figure 3. From scenario 1 to scenario 4, the values of $P_{\mathrm{inf}}$ exhibit a
sharp increase at $T_w = 0.64$, $T_w = 0.4$, $T_w = 0.35$ and $T_w = 0.26$,
respectively. Such a sharp increase of $P_{\mathrm{inf}}$ corresponds to the phase transition.
These values are in good agreement with the minimum required $T_w$ from the
corresponding curves in Figure 3. We assume that a giant component exists
if more than 5% of the cliques are connected.

We next compare these scenarios in terms of the fractional
sizes of influenced cliques and influenced individuals. For each
scenario, we plot the fractional size of the giant component
in E versus $T_f$ in Figure 5, which indicates the fraction of
cliques that will receive the information. We set $T_w = 0.3$,
$\lambda = 2$, $\alpha = 0.3$, $\gamma = 3$ and $\Gamma = 10$. In this figure, the
curves stand for analytical results obtained by (7), while the
marked points stand for the simulation results obtained by
averaging 200 experiments for each set of parameters. The
analytical results are in good agreement with
the simulations. The fractional size of influenced
cliques in scenario 4 (with average clique size 2) is much
larger than that in scenario 1 (with average clique size 1),
which indicates that large cliques in the social network could
greatly facilitate message propagation.

Fig. 5. Fractional size of influenced cliques in W versus $T_f$. We set $T_w = 0.3$,
$\lambda = 2$, $\alpha = 0.3$, $\gamma = 3$ and $\Gamma = 10$. The curves stand for analytical results
obtained by (7), while the marked points stand for the simulation results with
N = 12000 by averaging 200 experiments for each set of parameters. The
analytical results are in good agreement with the simulations.

We finally compare the fractional size of influenced individuals
in Figure 6. In this figure, the curves stand for the
fractional size of influenced nodes obtained via (8), whereas
the marked points stand for the simulation results. Similar to
Figure 5, the information propagates much more easily in a
social network when larger cliques exist. For instance,
when $T_f = 1$, the fraction of individuals that receive
the message grows sharply from 14% (scenario 1 with average
clique size 1) to 80% (scenario 4 with average clique size 2). In
conclusion, the above results agree with the natural conjecture
that messages are more influential (i.e., more likely to
reach a large portion of the population) when people are close
to each other.

Fig. 6. Fractional size of influenced nodes in W versus $T_f$. All the parameters
follow the same setup as in Figure 5. The curves stand for analytical results
obtained by (8) and the marked points stand for the simulation results
with N = 12000. The analytical results are in good agreement with the
simulations. For comparison, we also plot the fractional size of influenced
cliques in scenario 1, where each clique has only one node.

VI. Conclusion

In this study, we explore the diffusion of real-time information
in social networks. We develop an overlaying
social-physical network that consists of an online social network
and a physical information network with clique structure. We
theoretically quantify the condition and the size of information
epidemics. To the best of our knowledge, this paper is the first
work on the diffusion of real-time information that takes into
account the clique structure in social networks. We believe
that our findings offer initial steps towards understanding
the diffusion behavior of real-time information.



VII. Appendix 



A. Proofs of Lemma 4.1 and Lemma 4.2

In [11], [12], Söderberg studied the phase transition
behaviors of inhomogeneous random graphs where nodes are
connected by different types of edges. Such graphs are also
called "colored degree-driven random graphs" in the sense
that different types of edges correspond to different colors.
In a graph with r types of edges, the edge degree of an
arbitrary node can be represented by an r-dimensional vector
$d = [d^1 \cdots d^r]$, where $d^j$ stands for the number of type-j
edges incident on that node. In our study, the equivalent
graph E has two types of edges and the degree distribution of
an arbitrary node is denoted by $p(i,j) = P[d_w = i, d_f = j]$.
Also, the generating function of the degree distribution $\{p(i,j)\}$
can be defined by $H(x_1, x_2) = \sum_i \sum_j p(i,j)\, x_1^i x_2^j$. Clearly,
the multivariable combinatorial moments can be obtained by
partial differentiation at $x_1 = 1$ and $x_2 = 1$, i.e.,
$E[d_w] = \partial H/\partial x_1$, $E[d_f] = \partial H/\partial x_2$ and
$E[d_w d_f] = \partial^2 H/\partial x_1 \partial x_2$, all evaluated at $x_1 = x_2 = 1$.


where 
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E[rf/] 
The spectral radius of J is given by 



o- = - ( On + 022 + Y (an - 022) + 4ai2a2i 

By [5], [11], [12], if $\sigma > 1$, then with high probability there exists a
giant component in the graph E; otherwise, a giant component
is very unlikely to exist in E. Therefore, the condition (4)
in Lemma 4.1 is obtained. Furthermore, the fractional size $S_c$
equals $1 - g(1)$ [5]. By (18), we have that



$$S_c = 1 - \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} p(i,j)\, h_1^i h_2^j. \tag{21}$$
Let $\{g_k\}$ denote the size distribution of the largest connected
component that can be reached from an arbitrary node in E,
whose generating function is defined by $g(z) = \sum_k g_k z^k$.
Furthermore, we define a two-dimensional vector $h(z) =
[h_1(z), h_2(z)]$, where $h_i(z)$ stands for the generating function
of the size distribution of the component connected by type-$i$
edges. According to the existing results in [5], [11], [12], we
have that
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D n 00 00 
In view of (19) and (20), we have that
$$g(z) = z \sum_i \sum_j p(i,j)\, h_1(z)^i h_2(z)^j = z H(h(z)), \tag{18}$$
where $h(z)$ satisfies the following recursive equations
The emergence of a giant component in E can be checked by
examining the stability of the recursive equations (19)-(20) at
the point $h_1 = h_1(1) = 1$ and $h_2 = h_2(1) = 1$. Along the
same lines as in [11], [12], we define a 2 x 2 Jacobian matrix
J, i.e.,



$$J = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$$



'■ ^ n—l m— 



- 00 00 

'^^ - i?xTEE^(*'^Wi^r^ 

D n 
Furthermore, (21) can be rewritten in the following form:

00 00 £> n 

"^c = E E E E f^n„.pzpiA^ - ^>2)- 
Clearly, the term in parentheses gives the probability that a
node with colored degree $[d_w = i, d_f = j]$ belongs to the
giant component. In other words, the term in parentheses is
the expected number of cliques added to the giant cluster by
a degree $[d_w = i, d_f = j]$ clique. Hence, summing over all
such $i$, $j$'s we get an expression for the expected size of the
giant cluster (in terms of the number of cliques).
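The summation described above can be sketched numerically. The code below is a minimal illustration, not the authors' implementation: it assumes that the recursions (19)-(20), evaluated at z = 1, take the standard two-type form written in the docstring (consistent with (18)), and it solves them by fixed-point iteration. The function name, the dictionary representation of p(i,j), and the sample distribution are all hypothetical.

```python
def giant_fraction(p, iters=500):
    """Return sum over (i, j) of p(i,j) * (1 - h1**i * h2**j).

    p maps a colored degree (i, j) to its probability.
    h1, h2 are obtained by iterating the ASSUMED recursions at z = 1:
        h1 = sum_{i,j} i * p(i,j) * h1**(i-1) * h2**j / E[d_w]
        h2 = sum_{i,j} j * p(i,j) * h1**i * h2**(j-1) / E[d_f]
    """
    mean_w = sum(i * q for (i, j), q in p.items())
    mean_f = sum(j * q for (i, j), q in p.items())
    # Starting from 0 makes the iteration converge to the smallest fixed point.
    h1, h2 = 0.0, 0.0
    for _ in range(iters):
        new_h1, new_h2 = 1.0, 1.0  # used when an edge type is absent
        if mean_w > 0:
            new_h1 = sum(i * q * h1 ** (i - 1) * h2 ** j
                         for (i, j), q in p.items() if i > 0) / mean_w
        if mean_f > 0:
            new_h2 = sum(j * q * h1 ** i * h2 ** (j - 1)
                         for (i, j), q in p.items() if j > 0) / mean_f
        h1, h2 = new_h1, new_h2
    # Each term is the probability that a degree-(i, j) node is in the giant cluster.
    return sum(q * (1.0 - h1 ** i * h2 ** j) for (i, j), q in p.items())

# Hypothetical colored degree distribution, for illustration only.
p_example = {(0, 0): 0.2, (3, 1): 0.5, (1, 2): 0.3}
s_example = giant_fraction(p_example)
```

Weighting each term by the clique size n, as discussed next in the text, would turn the same loop into a node-level count; that variant is not shown because it needs the clique size distribution.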

In order to compute the expected giant component size in
terms of the number of nodes, namely to compute $S_n$, we can
modify the above expression such that the term $n(1 - h_1^i h_2^j)$
gives the expected number of nodes to be included in the giant
cluster by a degree $[d_w = i, d_f = j]$ clique. In other words,
with probability $(1 - h_1^i h_2^j)$ the clique under consideration will
belong to the giant component Gh and will bring $n$ nodes to
the actual giant size $S_n$. This yields
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We next have 



1 - ^ 



where the normalization term $C$ makes $S_n = 1$ at $h_1 = h_2 = 0$.
Therefore, the conclusions (7) and (8) in Lemma 4.2 have been
obtained.

B. Detailed Derivations for Equations (9)-(15)

As defined in Section III, $K_n^w$ is the sum of $n$ independent
copies of $k^w$ and $K_m^f$ is the sum of $m$ independent copies of
$k^f$. It follows that

$$E[K_n^w] = n E[k^w], \qquad E[K_m^f] = m E[k^f],$$



i=0 j=0 n=l 



nidff] = eEp(*'-?')^'' ^ E A^™^[(^;(.) 
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= E ^™ ('mE[(fc-^y] + (to^ - TO)(E[fc^]' 



We next characterize the generating functions of $K_n^w$ and $K_m^f$.
Specifically, the generating functions of the type-1 and type-2
link degree distributions for a single node in H can be defined
by $g(x) = \sum_k p_k^w x^k$ and $q(x) = \sum_k p_k^f x^k$. Since $K_n^w$
and $K_m^f$ are sums of i.i.d. random variables, their generating
functions turn out to be



$$E[(K_n^w)^2] = n E[(k^w)^2] + (n^2 - n)(E[k^w])^2$$
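The moment identities for a sum of i.i.d. copies can be checked exactly on a small example. The sketch below builds the distribution of the sum by repeated convolution and compares its first and second moments with $nE[k]$ and $nE[k^2] + (n^2 - n)(E[k])^2$. The helper names, the sample distribution, and the value of n are hypothetical and serve only as an illustration.

```python
def convolve(a, b):
    """PMF of the sum of two independent nonnegative integer variables."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, pa in enumerate(a):
        for j, pb in enumerate(b):
            out[i + j] += pa * pb
    return out

def pmf_of_sum(pmf, n):
    """PMF of the sum of n i.i.d. copies of a variable with the given PMF."""
    total = [1.0]  # point mass at 0, i.e. the empty sum
    for _ in range(n):
        total = convolve(total, pmf)
    return total

def moment(pmf, order):
    """E[k**order] for a PMF given as a list indexed by k."""
    return sum((k ** order) * p for k, p in enumerate(pmf))

pmf_k = [0.2, 0.5, 0.3]  # hypothetical P[k = 0], P[k = 1], P[k = 2]
n = 4
pmf_K = pmf_of_sum(pmf_k, n)

first_lhs = moment(pmf_K, 1)
first_rhs = n * moment(pmf_k, 1)
second_lhs = moment(pmf_K, 2)
second_rhs = n * moment(pmf_k, 2) + (n * n - n) * moment(pmf_k, 1) ** 2
```

Both pairs agree up to floating-point rounding for any distribution with finite support, since the identities only use independence of the copies.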



$$G_n(x) = \sum_k P_{nk}\, x^k = [g(x)]^n, \quad 1 \le n \le D, \tag{22}$$
$$E[(K_m^f)^2] = m E[(k^f)^2] + (m^2 - m)(E[k^f])^2.$$

In view of this, we can rewrite the first/second moments of
$d_w$ and $d_f$ as follows:
With (22) and (23), $E[h_1^{K_n^w}]$, $E[K_n^w h_1^{K_n^w - 1}]$, $E[h_2^{K_m^f}]$ and
$E[K_m^f h_2^{K_m^f - 1}]$ can boil down to expected values with
respect to the distributions of $k^w$ and $k^f$ as follows.

$$E[h_1^{K_n^w}] = G_n(h_1) = (g(h_1))^n = (E[h_1^{k^w}])^n$$

$$E[K_n^w h_1^{K_n^w - 1}] = n\, (E[h_1^{k^w}])^{n-1}\, E[k^w h_1^{k^w - 1}]$$
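The two generating-function identities for the sum of n i.i.d. copies can also be verified exactly on a toy distribution. The sketch below compares $E[h^{K}]$ with $(E[h^{k}])^n$ and $E[K h^{K-1}]$ with $n (E[h^{k}])^{n-1} E[k h^{k-1}]$, where K is the sum of n copies of k. All names, the sample distribution, n, and the evaluation point h are hypothetical.

```python
def convolve(a, b):
    """PMF of the sum of two independent nonnegative integer variables."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, pa in enumerate(a):
        for j, pb in enumerate(b):
            out[i + j] += pa * pb
    return out

def pmf_of_sum(pmf, n):
    """PMF of the sum of n i.i.d. copies of a variable with the given PMF."""
    total = [1.0]  # point mass at 0, i.e. the empty sum
    for _ in range(n):
        total = convolve(total, pmf)
    return total

def pgf(pmf, x):
    """Generating function E[x**k]."""
    return sum(p * x ** k for k, p in enumerate(pmf))

def pgf_derivative(pmf, x):
    """Derivative of the generating function, E[k * x**(k - 1)]."""
    return sum(k * p * x ** (k - 1) for k, p in enumerate(pmf) if k > 0)

pmf_k = [0.1, 0.6, 0.3]  # hypothetical single-node degree distribution
n = 3
h = 0.7                  # hypothetical evaluation point in [0, 1]
pmf_K = pmf_of_sum(pmf_k, n)

value_lhs = pgf(pmf_K, h)
value_rhs = pgf(pmf_k, h) ** n
deriv_lhs = pgf_derivative(pmf_K, h)
deriv_rhs = n * pgf(pmf_k, h) ** (n - 1) * pgf_derivative(pmf_k, h)
```

The second identity is simply the chain rule applied to the first, which is why the same check works for the type-2 quantities with m in place of n.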



$$E[h_2^{K_m^f}] = (E[h_2^{k^f}])^m$$

$$E[K_m^f h_2^{K_m^f - 1}] = m\, (E[h_2^{k^f}])^{m-1}\, E[k^f h_2^{k^f - 1}]$$



References 

[1] A. Mislove, M. Marcon, K.P. Gummadi, P. Druschel, and B. Bhattacharjee. Measurement and analysis of online social networks. In Proceedings of the 7th ACM SIGCOMM Conference on Internet Measurement, 2007.
[2] J. Leskovec, L.A. Adamic, and B.A. Huberman. The dynamics of viral marketing. ACM Transactions on the Web (TWEB), 1(1):5, 2007.
[3] L. Isella, J. Stehle, A. Barrat, C. Cattuto, J.F. Pinton, and W. Van den Broeck. What's in a crowd? Analysis of face-to-face behavioral networks. Journal of Theoretical Biology, 2010.
[4] K. Zhao, J. Stehle, G. Bianconi, and A. Barrat. Social network dynamics of face-to-face interactions. Phys. Rev. E, 83(5):056109, 2011.
[5] O. Yagan, D. Qian, J. Zhang, and D. Cochran. Conjoining speeds up information diffusion in overlaying social-physical networks. Technical report, available online at arXiv:1112.4002v1 [cs.SI].
[6] J. Yang and J. Leskovec. Modeling information diffusion in implicit networks. In Proceedings of the 10th IEEE International Conference on Data Mining, 2010.
[7] J. Stehle, A. Barrat, and G. Bianconi. Dynamical and bursty interactions in social networks. Phys. Rev. E, 81(3):035101, 2010.
[8] M. E. J. Newman, S. H. Strogatz, and D. J. Watts. Random graphs with arbitrary degree distributions and their applications. Phys. Rev. E, 64(2), 2001.
[9] M. E. J. Newman. Spread of epidemic disease on networks. Phys. Rev. E, 66(1), 2002.
[10] B. Soderberg. General formalism for inhomogeneous random graphs. Phys. Rev. E, 66(066121), 2002.
[11] B. Soderberg. Random graphs with hidden color. Phys. Rev. E, 68(015102(R)), 2003.
[12] B. Soderberg. Random graph models with hidden color. Acta Phys. Pol. B, 34:5085-5102, 2003.



